How clustering affects the bond percolation threshold in complex networks

The question of how clustering (non-zero density of triangles) in networks affects their bond
percolation threshold has important applications in a variety of disciplines. Recent advances in modelling
highly-clustered networks are employed here to analytically study the bond percolation threshold. 
In comparison to the threshold in an unclustered network with the same degree distribution and 
correlation structure, the presence of triangles in these model networks is shown to lead to a larger 
bond percolation threshold (i.e. clustering increases the epidemic threshold or decreases resilience 
of the network to random edge deletion). 

PACS numbers: 89.75.Hc, 64.60.aq, 64.60.ah, 87.23.Ge



I. INTRODUCTION 

Clustering (or transitivity) in a complex network refers 
to the propensity of two neighbors of a given node to 
also be neighbors of each other, thus forming a triangle 
of edges within the graph. In a recent paper [1], Newman
proposes a model of random networks with clustering
which permits analytical solution for many important
properties. An alternative model, based on embedding
cliques in a locally tree-like structure, was subsequently
proposed by one of us [2]. One of the most important
predictions of these models is the effect of network clustering
on the bond percolation process, which is a topic
of considerable interest [3-12].



The bond percolation problem for a network may be
stated as follows: each edge of the network graph is visited
once, and damaged (deleted) with probability $1 - p$.
The quantity $p$ is the bond occupation probability and the
non-damaged edges are termed occupied. In an infinite
graph, the size of the giant connected component (GCC)
of the graph becomes nonzero at some critical value of
$p > 0$: this critical value of $p$ is termed the bond percolation
threshold, denoted $p_{\rm th}$. The bond percolation
problem has applications in epidemiology, where $p$ is related
to the average transmissibility of a disease and the
GCC represents the size of an epidemic outbreak [14],
and in the analysis of technological networks, where the
resilience of a network to the random failure of links is
quantified by the size of the GCC [7]. Analytical solutions
for percolation on randomly-wired networks and
on correlated networks are well-known [15-20], but these
cases have zero clustering in the limit of infinite network
size.
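
The edge-deletion process described above can be sketched numerically. The following is only an illustrative Monte Carlo sketch for a finite graph, not part of the analytical theory: the function names and the edge-list representation are our own choices, and the largest cluster of a finite sample merely approximates the GCC of an infinite network.

```python
import random


def find(parent, i):
    """Return the root of i, halving the path as we go."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def largest_component_fraction(n_nodes, edges, p, rng):
    """Occupy each edge with probability p; return the largest cluster fraction."""
    parent = list(range(n_nodes))
    size = [1] * n_nodes
    for u, v in edges:
        if rng.random() >= p:  # the edge is damaged with probability 1 - p
            continue
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru  # merge the smaller cluster into the larger one
            size[ru] += size[rv]
    return max(size[find(parent, i)] for i in range(n_nodes)) / n_nodes


if __name__ == "__main__":
    rng = random.Random(0)
    ring = [(i, (i + 1) % 6) for i in range(6)]  # hypothetical 6-node ring
    for p in (0.0, 0.5, 1.0):
        print(p, largest_component_fraction(6, ring, p, rng))
```

Averaging the returned fraction over many realizations and several values of $p$ gives a rough numerical estimate of the curve $S(p)$ for a given finite network.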

Newman solves the bond percolation problem within
his model [1] and considers the effect of clustering on the
bond percolation threshold. He gives an example where
clustering decreases the value of $p_{\rm th}$ within the context
of a certain set of networks which all share the same average
degree (see Fig. 2 of [1]). However, Newman notes
that the networks in his comparison set, while having
the same average degree, do not all have the same degree
distribution (see Section III for further discussion
of this point). Miller [21] recently showed analytically
that within the model [1] the bond percolation threshold
in a clustered network is greater than the corresponding
threshold in an unclustered network with the same degree
distribution and correlation structure. A similar conclusion
was reported by Kiss and Green [11] based on their
numerical simulations using Newman's clustered bipartite
graph model [3]. In this paper we focus on networks
generated by the clique-based model [2] and show that
the effect of clustering is qualitatively similar to that determined
by Miller for the triangle-based model [1], i.e.,
the presence of clustering increases the bond percolation
threshold (and hence the epidemic threshold) when networks
with the same degree distribution and correlation
structure are compared. We emphasize that the degree-degree
correlation structure in the clustered network includes
non-trivial correlations beyond nearest-neighbors,
and we consider the implications of this fact.

We begin by introducing the recently published models
for clustered random networks, and in Section II we
apply these to random regular graphs. Networks with
heterogeneous degree distributions are examined in Sections
III and IV, and conclusions are drawn in Section V.
Extended mathematical calculations are relegated to the
appendices.

We first briefly review two recent models for infinite
random networks with non-zero clustering. The fundamental
quantity describing the networks of [2] is the
joint probability distribution $\gamma(k, c)$, giving the probability
that a randomly-chosen node has degree $k$ and is
a member of a $c$-clique (a fully-connected subgraph of
$c$ nodes). In these networks, nodes may be part of at
most one clique. Nodes which are members of a $c$-clique
have $c - 1$ edges linking them to neighbors within the
same clique. They also have an additional $k - (c - 1)$
neighbors who are not in the same clique as themselves
(note $\gamma(k, c) = 0$ for $c > k + 1$ since nodes in a $c$-clique
must have at least $c - 1$ neighbors). Edges which are
not internal to a clique are termed external links. The
degree distribution $P_k$ of the network (probability that
a random node has $k$ neighbors) is obtained from $\gamma$ by
averaging over all possible clique sizes:

$$P_k = \sum_{c=1}^{k+1} \gamma(k, c), \qquad (1)$$



and the degree-dependent clustering coefficient $c_k$ [22] is
given in terms of $\gamma$ by

$$c_k = \sum_{c=1}^{k+1} \frac{\gamma(k, c)}{P_k}\,\frac{(c-1)(c-2)}{k(k-1)}, \qquad (2)$$

see [2] for details. The overall network clustering coefficient
$C$ [22] is then $C = \sum_{k \ge 2} P_k c_k$.

Analytical results for the giant connected component
size are given in [2] and the bond percolation threshold
$p_{\rm th}^{(\gamma)}$ is shown to be the solution of the following polynomial
equation for $p$:

$$\frac{1}{z_e} \sum_{k,c} (k - c + 1)\,\gamma(k, c)\left[\,p\,(k - c) + (z_c - c + 1)\,D_c(p)\right] = 1. \qquad (3)$$

Here $z_e$ is the average number of external links per node:
$z_e = \sum_{k,c} (k - c + 1)\gamma(k, c)$; $z_c$ is the average degree of
nodes in cliques of size $c$: $z_c = \sum_k k\,\gamma(k, c) / \sum_k \gamma(k, c)$;
and $D_c(p) = p \sum_{m=1}^{c} (m - 1) P(m|c)$ are polynomial functions
of $p$. The functions $P(m|c)$ give the probability that
a node in a $c$-clique belongs to a connected cluster of $m$
nodes within the clique, including itself; these polynomial
functions of $p$ are defined and tabulated in [2].

A different approach to modelling local clustering is
taken in Newman's model [1] (see also [21]). The joint
distribution $p_{s,t}$ gives the probability that a randomly-chosen
node is connected to $s$ single edges (similar to the
external links of the $\gamma$-theory networks) and to $t$ triangles.
The degree distribution is then given by

$$P_k = \sum_{s,t} p_{s,t}\,\delta_{k,\,s+2t}, \qquad (4)$$

and the clustering coefficient, GCC size, and bond percolation
threshold (denoted $p_{\rm th}^{(N)}$ for Newman's model)
may all be determined analytically (see [1], [21] and Appendix A).

It is instructive to compare the constraints imposed on
the network structure in each of these models. In Newman's
model, a $k$-degree node may be a member of up
to $\lfloor k/2 \rfloor$ disjoint triangles, and thus have a local clustering
coefficient of up to $1/(k - 1)$ if $k$ is even, or up
to $1/k$ if $k$ is odd. In contrast, nodes in the $\gamma$-theory
networks can be members of only a single clique, but
using large cliques can give arbitrarily high clustering.
In Section II we show that both models imply $p_{\rm th}$ is increased
by clustering on random regular graphs. This has
recently been demonstrated for the case of triangle-based
networks [1] by Miller [21], but we focus on the case of
higher-clustering $\gamma$-theory networks. A special class of
clustered networks are those whose nodes may belong to
at most one triangle. Both models [1,2] are applicable to
networks in this class, and in Section III (see, for example,
Fig. 3) we illustrate the interaction between clustering
and correlation common to both models of clustering.



II. RANDOM REGULAR GRAPHS 

In this Section we restrict our attention to random $z$-regular
graphs, i.e., random graphs in which all nodes
have the same degree $z$. As shown in [18] random graphs
with zero clustering (in the limit $N \to \infty$ of infinite
number of nodes) may be generated using the configuration
model [23, 24], for which the percolation threshold
is given in terms of the degree distribution $P_k$ as

$$p_{\rm th}^{(1)} = \frac{\sum_k k P_k}{\sum_k k(k-1) P_k}. \qquad (5)$$

For random regular graphs the degree distribution is simply
$P_k = \delta_{k,z}$, and the zero-clustering percolation threshold
is $p_{\rm th}^{(1)} = \frac{1}{z-1}$.
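
As a quick numerical check of the zero-clustering threshold, the ratio of the first moment to the second factorial moment of the degree distribution can be evaluated directly. This is a minimal sketch; the dictionary representation of $P_k$, the function names, and the truncation point `k_max` for the Poisson case are our own illustrative choices.

```python
import math


def threshold_from_degree_distribution(P):
    """P maps degree k -> probability P_k; returns sum k P_k / sum k (k - 1) P_k."""
    first = sum(k * pk for k, pk in P.items())
    second = sum(k * (k - 1) * pk for k, pk in P.items())
    if second == 0:
        raise ValueError("no node has degree >= 2, so no giant component can form")
    return first / second


def poisson_distribution(z, k_max=80):
    """Poisson P_k = exp(-z) z^k / k!, truncated at k_max (an assumed cutoff)."""
    return {k: math.exp(-z) * z**k / math.factorial(k) for k in range(k_max + 1)}


if __name__ == "__main__":
    print(threshold_from_degree_distribution({4: 1.0}))                  # 4-regular graph
    print(threshold_from_degree_distribution(poisson_distribution(2.0)))  # Poisson, z = 2
```

A returned value above 1 indicates that no giant component exists even when every edge is occupied.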

Next we employ Eq. (3) to consider the effect of non-zero
clustering in regular networks generated using the
algorithm of [2]. In [2] a parametrization of $\gamma(k, c)$ is
suggested which is consistent with (1) and allows the
clustering to be easily adjusted:

$$\gamma(k, c) = P_k \binom{k}{c-1} g_k^{\,c-1} (1 - g_k)^{k-c+1}, \qquad c = 1, \ldots, k+1. \qquad (6)$$

This is a binomial distribution of the probability mass
for $k$-degree nodes across the $c$-clique classes for $c$ from
1 to $k + 1$, governed by the parameter $g_k$. Substituting
(6) into (2) gives the remarkably simple relation $c_k = g_k^2$
between the degree-dependent clustering coefficient and
the parameter $g_k$. For the random regular graphs under
consideration here, $\gamma(k, c)$ is nonzero only for $k = z$,
and setting $g_z = \sqrt{C}$ in (6) allows us to investigate regular
graphs with clustering coefficient $C$ covering the full
range [0,1].
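
The relation between the binomial parameter and the clustering coefficient can be verified numerically. The sketch below assumes the binomial form described in the text, with $c - 1$ playing the role of the number of successes in $k$ trials, and assumes that the only linked pairs of neighbors of a node are those inside its own clique; the function names and dictionary layout are hypothetical.

```python
from math import comb


def gamma_binomial(P, g):
    """gamma(k, c) = P_k * C(k, c-1) g_k^(c-1) (1 - g_k)^(k-c+1), c = 1..k+1."""
    gamma = {}
    for k, pk in P.items():
        for c in range(1, k + 2):
            gamma[(k, c)] = pk * comb(k, c - 1) * g[k] ** (c - 1) * (1 - g[k]) ** (k - c + 1)
    return gamma


def degree_clustering(gamma, k):
    """c_k: fraction of neighbor pairs of a degree-k node that are themselves linked."""
    pk = sum(v for (kk, _), v in gamma.items() if kk == k)
    linked = sum(v * (c - 1) * (c - 2) for (kk, c), v in gamma.items() if kk == k)
    return linked / (pk * k * (k - 1))


if __name__ == "__main__":
    z, C = 4, 0.25
    gamma = gamma_binomial({z: 1.0}, {z: C ** 0.5})
    print(sum(gamma.values()), degree_clustering(gamma, z))
```

For the $z = 4$ example with $g_z = 0.5$ the computed coefficient agrees with $g_z^2$ up to rounding, since the second factorial moment of a binomial variable with $k$ trials is $k(k-1)g^2$.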

Figure 1(a) compares the bond percolation threshold
$p_{\rm th}^{(\gamma)}$ in clustered $\gamma$-theory networks (determined by
numerical solution of the polynomial Eq. (3), using
parametrization (6)) with the zero-clustering threshold
$p_{\rm th}^{(1)} = 1/(z - 1)$. We also show (magenta dash-dot curves)
the percolation threshold $p_{\rm th}^{(N)}$ given by Newman's model
[1], and the symbols show the threshold $p_{\rm th}^{(B)}$ found from
an earlier bipartite-graph model of clustering [3], see Appendix A
for details. It is clear that all three clustering
models give thresholds which are larger than $p_{\rm th}^{(1)}$
for $C > 0$, i.e., clustering increases the bond percolation
threshold in these random regular graphs. Support for
this statement in the case of $\gamma$-theory networks is given
in Appendix B. The corresponding result for $p_{\rm th}^{(N)}$ follows
from the recent work of Miller [21].

Analytical expressions determining the size $S$ of the giant
connected component in $\gamma$-theory networks are also
given in [2] and Fig. 1(b) shows $S$ as a function of bond
occupation probability $p$ for $z = 4$, using parametrization
(6). As already noted, increased clustering leads to
higher values of the transition point $p_{\rm th}^{(\gamma)}$, but also leads
to smaller GCC sizes.

FIG. 1: (Color online) (a) Bond percolation threshold in
$z$-regular graphs with clustering $C$, generated using the algorithms
of [2] ($p_{\rm th}^{(\gamma)}$, black solid), [1] ($p_{\rm th}^{(N)}$, magenta dash-dot),
and [3] ($p_{\rm th}^{(B)}$, blue symbols). For comparison, the threshold
$p_{\rm th}^{(1)}$ in an unclustered $z$-regular graph is shown by the red
dashed line. Note $p_{\rm th}^{(\gamma)} = p_{\rm th}^{(N)} = p_{\rm th}^{(B)} = p_{\rm th}^{(1)}$ when $C = 0$, but
the clustered cases all have $p_{\rm th}$ values exceeding $p_{\rm th}^{(1)}$ when
$C > 0$. Values of $z$ are $z = 3$ (top), $z = 4$ (middle), and $z = 6$
(bottom). (b) Sizes of GCC $S(p)$ in $z = 4$ $\gamma$-theory regular
graphs with clustering coefficients as shown.

Having established that the presence of clustering in- 
creases pth in several models of clustered regular graphs, 
in the remainder of this paper we will consider how di- 
versity of node degrees also plays an important role. 



III. HETEROGENEOUS NETWORKS 

Networks with a range of node degrees may be characterized
at first order by their degree distribution $P_k$ or,
at second order, by the joint probability $P(k, k')$ that a
randomly-chosen edge links vertices of degree $k$ and $k'$.
Analytical results for the percolation threshold are known
for the ensembles of networks described fully by $P_k$ [13]
or by $P(k, k')$, with respective thresholds denoted $p_{\rm th}^{(1)}$
and $p_{\rm th}^{(2)}$; see Eq. (5) and Appendix C.

In this section we compare the bond percolation
threshold $p_{\rm th}^{(\gamma)}$ for various clustered networks with the
values $p_{\rm th}^{(1)}$ and $p_{\rm th}^{(2)}$ corresponding to zero-clustering networks
with the same degree distribution, or same degree-degree
correlations as the clustered network. Our first
example is a Poisson random network with degree distribution
$P_k = e^{-z} z^k / k!$ and mean degree $z = 2$. Figure 2(a)
compares $p_{\rm th}^{(\gamma)}$ from Eq. (3) with $p_{\rm th}^{(1)} = 1/z$
and $p_{\rm th}^{(2)}$, the latter being determined using the joint
distribution $P(k, k')$ for $\gamma$-theory networks derived in
Appendix C. The clustering level of the $\gamma$-theory networks
is controlled using the parametrization (6), with
$g_k = \sqrt{C/(1 - P_0 - P_1)}$ for all $k$, so that the average
clustering coefficient $\sum_{k \ge 2} P_k c_k$ is equal to $C$. Note
that the $p_{\rm th}^{(1)}$ line (and $p_{\rm th}^{(2)}$ curve) show the thresholds
in unclustered networks with the same degree distribution
(and $P(k, k')$ distribution) as the $\gamma$-theory network
with clustering $C$.

We see that $p_{\rm th}^{(\gamma)}$ is larger than both of the zero-clustering
thresholds $p_{\rm th}^{(1)}$ and $p_{\rm th}^{(2)}$, consistent with our
claim that clustering increases the bond percolation
threshold. The fact that $p_{\rm th}^{(2)}$ is less than $p_{\rm th}^{(1)}$ is due to the
assortativity of the $\gamma$-theory networks, see Appendix C
and [25].

Figure 2(b) shows the GCC size $S$ in the $\gamma$-theory
network (black solid curve) as a function of $p$ for clustering
$C = 0.3$. Also shown are the GCC sizes in a
zero-clustering network with the same degree distribution
$P_k$ (red dashed curve) and with the same $P(k, k')$
distribution (blue dash-dot curve). This figure can be
compared to Fig. 2 of [1] where higher-clustering cases
seem to have lower percolation thresholds than the zero-clustering
case. However, it should be noted that the focus
in [1] is on a different comparison to that undertaken
here. The cases plotted in Fig. 2 of [1] are generated
from a double Poisson $p_{s,t}$ distribution (see Eq. (13) of
[1]) and all share the same mean degree $z$, but not the
same degree distribution. In short, we compare clustered
networks with unclustered versions with the same $P_k$ (or
$P(k, k')$), while Newman's comparison in [1] retains a
common form for the joint distribution $p_{s,t}$ but does not
conserve the degree distribution. A similar analysis applies
to Fig. 2 of [3], where again it may be shown that
the clustered networks used have percolation thresholds
larger than those of unclustered networks with the same
degree distribution. In fact this has been demonstrated
numerically by Kiss and Green [11], who compared the
GCC sizes for the networks of [3] with the GCC sizes in
rewired versions of these networks.

FIG. 2: (Color online) (a) Bond percolation threshold $p_{\rm th}^{(\gamma)}$ in
$\gamma$-theory networks with Poisson degree distribution, $z = 2$,
and clustering $C$ (black solid). For comparison, also shown
is the threshold $p_{\rm th}^{(1)}$ in an unclustered network with same
degree distribution (red dashed), and the threshold $p_{\rm th}^{(2)}$ in an
unclustered network with the same degree-degree correlations
(blue dash-dot) as the $\gamma$-network. (b) Sizes of GCC $S(p)$ for
the case $C = 0.3$ in $\gamma$-theory networks (black solid), and in
unclustered networks with the same degree distribution (red
dashed), or same degree-degree correlations (blue dash-dot).

Having examined the results for regular graphs and
Poisson random networks, one might be tempted at this
point to conclude that $p_{\rm th}^{(\gamma)}$ is always greater than $p_{\rm th}^{(1)}$
and $p_{\rm th}^{(2)}$. However, the situation is rather more complicated
than this, as demonstrated in Fig. 3 and discussed
(for Newman's triangle-based networks) in [21]. To facilitate
analysis, and to enable the application of both the
$\gamma$-theory [2] and Newman's theory [1], we restrict our attention
now to the special class of networks in which each
node has either zero local clustering, or is part of a single
triangle. In terms of the $\gamma$-theory, this means $\gamma(k, c) = 0$
unless $c = 1$ or $c = 3$. For Fig. 3 we have also used a
particularly simple degree distribution, with exactly half
the nodes having degree $k = 2$ and the other half having
degree $k = 3$. The networks examined are thus described
within the theoretical models as follows:

$$\gamma(2,1) = p_{2,0} = \tfrac{1}{2}(1 - \alpha); \qquad \gamma(2,3) = p_{0,1} = \tfrac{1}{2}\alpha,$$
$$\gamma(3,1) = p_{3,0} = \tfrac{1}{2}(1 - \beta); \qquad \gamma(3,3) = p_{1,1} = \tfrac{1}{2}\beta, \qquad (7)$$

with the parameters $\alpha$ and $\beta$ controlling the level of clustering
for each degree class.

Figure 3 shows that $p_{\rm th}^{(\gamma)}$ (which equals $p_{\rm th}^{(N)}$ in this special
class of networks) may lie either below (Fig. 3(a)) or
above (Fig. 3(c)) the zero-clustering thresholds $p_{\rm th}^{(1)}$ and
$p_{\rm th}^{(2)}$. Recall our claim is that the presence of triangles
increases $p_{\rm th}$ relative to its value in unclustered networks
with the same degree distribution and same correlation
structure. In the next section we show that the correlation
structure in these examples is not fully described by
only nearest-neighbor correlations as given by $P(k, k')$.
When, as described in Section IV, the correlation structure
is fully matched but clustering eliminated, the GCC
size $S(p)$ is given by the magenta (dotted) curve in Fig. 3.
Note the transition point for the black (solid) curve is
larger in all cases than the transition point for the magenta
curve, supporting our claim. Detailed analysis of
the correlation structure for these cases is given in Section IV
and Appendix E.



IV. UNCLUSTERED NETWORKS WITH 
CORRELATION STRUCTURE 

In this section we restrict our attention to the special
class of $\gamma$-theory networks wherein nodes are members of
either one clique or of none, and all cliques are of equal
size $c = \hat{c}$ (the example in Section III used $\hat{c} = 3$), i.e.,

$$\gamma(k, c) = P_k (1 - \alpha_k)\,\delta_{c,1} + P_k\,\alpha_k\,\delta_{c,\hat{c}} \qquad (8)$$

for a prescribed degree distribution $P_k$, and with $\alpha_k$ determining
the level of clustering for degree-$k$ nodes. Note
that the theoretical approaches of [1] and [2] both apply
in the case $\hat{c} = 3$.

To understand the correlation structure of these networks
we visualize each edge of a network as being colored
either green or red (compare to the approach for
the triangle-based Newman model taken recently in [21]).
The rule for edge-coloring is simple: all edges which
form part of a $c$-clique are colored red, while the remaining
edges (the external links in the $\gamma$-theory notation)
are all colored green, see Fig. 4(a) for an example
with $c = 3$. Now consider the following rewiring process,
which preserves the correlation structure, but destroys
the clustering within the network. First, break each edge
into two end-stubs with each stub retaining the color of
the original edge. We now have $N$ isolated "hedgehog"
nodes, each with a set of colored stubs as its "spines", see
Fig. 4(b). The network is then reconnected together by
randomly selecting pairs of green stubs to be joined with
a green edge, and similarly randomly pairing red stubs
with red edges. The construction method for the original
$\gamma$-theory (or Newman theory) involves a similar joining of
like-colored stubs, except that the randomly chosen red
stubs are gathered into $c$-cliques. By simply joining pairs
of red stubs at random we retain the degree-degree correlation
structure (including correlations beyond nearest-neighbor)
of the $\gamma$-theory network, but eliminate triangles
(in the $N \to \infty$ limit). The resulting network, which
we dub the colored-edge network, has properties which
are influenced by the fact that red and green stubs are
not randomly distributed among the nodes. Taking $c = 3$
for example, each node is a member of 0 or 1 triangle, so
we know that each node must have either exactly zero or
exactly two red edges linked to it, while a node of degree
$k$ has either $k$ or $k - 2$ green edges. These constraints
mean the correlation structure of the colored-edge network
is not completely described only by the nearest-neighbor
correlations (i.e., by the $P(k, k')$ distribution of
Appendix C). A worked example showing this correlation
structure is given in Appendix D.

FIG. 3: (Color online) Sizes of GCC $S(p)$ for the $\gamma$-theory
networks defined by (7) (black solid) and in unclustered networks
with the same degree distribution (red dashed), or same
degree-degree correlations (blue dash-dot). The magenta dotted
curve is for the colored-edge (unclustered) networks defined
in Section IV. Parameters are $\alpha = 0.9$, with (a) $\beta = 0.1$,
(b) $\beta = 0.4$, and (c) $\beta = 0.5$.

FIG. 4: (Color online) Segment of a clustered network with
clique edges colored red (thin lines) and external links colored
green (thick lines). After breaking each edge to obtain colored
stubs as in (b), a realization of a colored-edge network
is created by randomly connecting pairs of stubs of the same
color.
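
The stub-breaking and same-color re-pairing steps described above can be sketched for a finite edge list as follows. This is an illustrative sketch only: the `(u, v, color)` tuple format and the function names are hypothetical, and in a finite network the random pairing may create self-loops or repeated edges, which become negligible only in the large-$N$ limit.

```python
import random
from collections import Counter


def colored_edge_rewire(edges, rng):
    """edges: list of (u, v, color). Break every edge into two stubs of its
    color, then pair stubs of the same color uniformly at random."""
    stubs = {}
    for u, v, color in edges:
        stubs.setdefault(color, []).extend([u, v])  # two stubs per edge
    rewired = []
    for color, nodes in stubs.items():
        rng.shuffle(nodes)
        # Consecutive entries of the shuffled list are joined by a new edge.
        rewired.extend((nodes[i], nodes[i + 1], color) for i in range(0, len(nodes), 2))
    return rewired


def colored_degrees(edges):
    """Number of stubs of each color held by each node (a self-loop counts twice)."""
    counts = Counter()
    for u, v, color in edges:
        counts[(u, color)] += 1
        counts[(v, color)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical fragment: one red triangle (0, 1, 2) plus three green external links.
    edges = [(0, 1, "red"), (1, 2, "red"), (0, 2, "red"),
             (2, 3, "green"), (3, 4, "green"), (4, 0, "green")]
    print(colored_edge_rewire(edges, random.Random(1)))
```

Because every node keeps exactly the stubs it started with, the number of red and green edges at each node is unchanged by the rewiring; only the triangles are (typically) destroyed.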



Despite the non-trivial correlation structure, the lack
of clustering permits the application of standard tree-based
approaches to find the GCC size and the bond
percolation threshold $p_{\rm th}^{(ce)}$ for colored-edge networks generated
from $\gamma$-theory networks with the single non-trivial
clique class $c = \hat{c}$ (see [21] for the case $\hat{c} = 3$, and Appendix E
for the general $\hat{c}$ case). The magenta dotted
curve in Fig. 3 shows the GCC size for the colored-edge
networks. In Appendix E we show analytically that
$p_{\rm th}^{(\gamma)} \ge p_{\rm th}^{(ce)}$, i.e., that the clustering in the original network
causes it to have an increased bond percolation
threshold compared to the colored-edge network with the
same correlation structure. However, the relative ordering
of $p_{\rm th}^{(ce)}$ and $p_{\rm th}^{(2)}$ (or $p_{\rm th}^{(1)}$), and hence the ordering
of $p_{\rm th}^{(\gamma)}$, $p_{\rm th}^{(1)}$, and $p_{\rm th}^{(2)}$, depends on the details of the correlation
structure beyond nearest-neighbors, so the fact that
$p_{\rm th}^{(\gamma)}$ exceeds $p_{\rm th}^{(ce)}$ does not guarantee it will exceed $p_{\rm th}^{(2)}$,
see Fig. 3(a) for an example. Further work is needed to
elucidate the effects of the correlation structure on $p_{\rm th}$
in these unclustered networks, but we believe the effect
of clique-based clustering has now been clearly separated
from this question.


V. CONCLUSIONS

We have shown that within the context of the clique-based
model of [2], clustering increases the bond percolation
threshold in comparison with its value for networks
with (i) the same degree distribution and (ii) the same
correlation structure. In Section II we used three different
approaches for constructing random regular networks
with clustering, and confirmed that $p_{\rm th}$ is increased by
the presence of clustering, both in triangle-based networks
(as shown in [21]) and also in the highly-clustered
clique-based models of [3] (as first demonstrated in [11])
and [2] (see Fig. 1 and Appendix B). In Sections III
and IV we highlighted the importance of condition (ii)
by showing that the $n$th-nearest-neighbor correlations affect
$p_{\rm th}$ even in the absence of clustering, i.e., networks
with identical nearest-neighbor correlations (as given by
the $P(k, k')$ distribution) can have differing $p_{\rm th}$ due to
correlations beyond nearest neighbor. The $n$th-nearest-neighbor
correlations are therefore also important when
investigating the effects of clustering within various models.
When these correlations are fully accounted for, our
result remains valid (see Fig. 3 and Appendix E).

What should be our intuitive understanding of the effects
of clustering? We believe the correct viewpoint was
in fact given by Newman [1] when discussing the giant
component size in the case $p = 1$: "the triangles that give
the network its clustering contain redundant edges that
serve no purpose in connecting the giant component together".
In other words, the redundant edges cause the
GCC size in a clustered network to be smaller than (or
at most equal to) the GCC of an unclustered network
with the same correlation structure, thus explaining the
observation that clustering decreases the value of $S(1)$ in
the Newman model [21]. All our results indicate that in
fact $S^{(\gamma)}(p) \le S^{(ce)}(p)$ for all $p$ in [0, 1], i.e., that clustering
reduces the GCC size for all values of $p$ (compared,
as usual, to an unclustered network with same correlation
structure), not just for $p = 1$. Our main result,
that $p_{\rm th}^{(\gamma)} \ge p_{\rm th}^{(ce)}$, may be seen as a simple consequence
of this fact: since the GCC size in the clustered network
is smaller than (or at most equal to) that in the
unclustered network for all $p$, the transition point where
the clustered GCC size becomes nonzero cannot be smaller
than the transition point for the unclustered network.
We therefore believe that Newman's explanation of clustering
as adding redundant edges reveals the essence of
the matter.

In the recent paper [21], Miller independently derives
the triangle-based clustering model of [1]. He also demonstrates
that within the context of this model, clustering
increases the bond percolation threshold in the same
sense as claimed here (i.e., when compared to an unclustered
network with identical correlation structure).
Our work is complementary to [21], since we show that
the qualitative effect of clustering seen in triangle-based
networks (i.e. clustering increases $p_{\rm th}$) is also present
in more heavily-clustered networks described by clique-based
theory (compare our results in Appendices B and
E with those in [21]).

The application of these results to real-world networks
remains a significant challenge. In this paper it was possible
to separate the effects of clustering and the related
correlation structure within the theoretical models [1], [2],
but it is not clear how this might be attempted for a given
real-world network or indeed for other theoretical models
with clustering. Nevertheless, the understanding that
within the models [1], [2] clustering (as distinct from related
correlation effects) leads generically to an increase
in the bond percolation threshold marks, we believe, an
important step forward.


Appendix A: Other clustering models

Newman's results [1] may be used to derive the following
polynomial equation for the bond percolation threshold
$p = p_{\rm th}^{(N)}$ in networks described by the joint distribution
$p_{s,t}$ (see also [21]):

$$2p(1 + p - p^2)\Big( p\big(\langle s^2 - s\rangle \langle t^2 - t\rangle - \langle st\rangle^2\big) - \langle s\rangle \langle t^2 - t\rangle \Big) - p\,\langle s^2 - s\rangle \langle t\rangle + \langle s\rangle \langle t\rangle = 0, \qquad (A1)$$

where $s$ and $t$ specify respectively the number of single
edges and triangles attached to a vertex, and $\langle \cdot \rangle$
denotes the average over the joint distribution $p_{s,t}$. For
random $z$-regular graphs we assume the following distribution
of probability mass:

$$p_{s,t} = \delta_{s,\,z-2t} \binom{\lfloor z/2 \rfloor}{t} g^{t} (1 - g)^{\lfloor z/2 \rfloor - t} \quad \text{for } t = 0 \text{ to } \lfloor z/2 \rfloor, \qquad (A2)$$

and calculate the clustering $C$ in terms of the single parameter
$g$ using the results of [1]. The magenta dash-dotted
curves in Fig. 1 show $p_{\rm th}^{(N)}$ as a function of $C$.

Another analytically solvable case of clustered random
regular graphs is provided by Newman's bipartite
graph model [3]. In this model, nodes may be part of
some number of groups (cliques), and the structure may
be represented as a bipartite graph with links between
nodes (individuals) and the groups (cliques) of which
they are members. In general this model cannot be fitted
to desired degree distributions, but the special case
of $z$-regular graphs may be produced by taking the distribution
of group sizes to be $s_n = \delta_{n,\nu}$, and the number
of groups in which a node partakes to be distributed as
$r_m = \delta_{m,\mu}$, where integers $\nu$ and $\mu$ satisfy the relation
$(\nu - 1)\mu = z$. For the case $z = 6$, for example, there exist
3 such $(\nu, \mu)$ pairs: (2,6), (3,3), and (4,2), leading to
respective clustering coefficients of 0, 1/5, and 2/5. The
formulas given in [3] allow us to calculate the bond percolation
threshold for each of these cases, and the results
are plotted with symbols in Fig. 1(a). Consistent with
the models of [1], [2], the percolation threshold is clearly
increased above its unclustered value in this model.
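
The admissible group structures and their clustering coefficients can be enumerated directly. The sketch below rests on two assumptions of ours that are not spelled out in the text: each group is a fully connected clique whose members share no other group, so that the only linked neighbor pairs of a node lie inside one of its groups, and the case $\mu = 1$ (isolated cliques) is excluded, which is our reading of why three pairs are counted for $z = 6$.

```python
from fractions import Fraction


def regular_bipartite_cases(z):
    """Return sorted (nu, mu, C) with (nu - 1) * mu == z and mu >= 2."""
    cases = []
    for mu in range(2, z + 1):
        if z % mu:
            continue
        nu = z // mu + 1
        # Linked neighbor pairs: C(nu - 1, 2) inside each of the mu groups.
        linked_pairs = mu * (nu - 1) * (nu - 2) // 2
        all_pairs = z * (z - 1) // 2
        cases.append((nu, mu, Fraction(linked_pairs, all_pairs)))
    return sorted(cases)


if __name__ == "__main__":
    for nu, mu, C in regular_bipartite_cases(6):
        print(nu, mu, C)
```

For $z = 6$ this reproduces the three pairs and clustering coefficients quoted above.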



Appendix B: Clustering increases $p_{\rm th}$ in random
regular graphs

Here we demonstrate that for random $z$-regular graphs
generated using the $\gamma$-theory [2], the bond percolation
threshold $p_{\rm th}^{(\gamma)}$ is larger than (or equal to) the value
$p_{\rm th}^{(1)} = 1/(z - 1)$
for an unclustered network. We show this for a general
$\gamma(k, c)$ distribution, so the result is not dependent on a
particular parametrization such as (6).

Note from (3) that $p_{\rm th}^{(\gamma)}$ is the solution of the polynomial
equation $F(p) = 1$ where

$$F(p) = \frac{1}{z_e} \sum_c (z - c + 1)\,\gamma(z, c) \left[\, p(z - c) + (z - c + 1) D_c(p) \right], \qquad (B1)$$

with $z_e = \sum_c (z - c + 1)\gamma(z, c)$. We use the following
two properties of the polynomials $D_c(p)$: (a) $D_c(p)$ is
a monotonically increasing function of $p$ on the interval
[0, 1] with $D_c(0) = 0$, and (b) $D_c(p)$ is bounded above by

$$D_c(p) \le \frac{p^2 (c - 1)}{1 - p(c - 2)} \qquad (B2)$$

for all $p$ with $0 \le p < 1/(c - 2)$.

By property (a), the polynomial $F(p)$ defined in (B1)
is monotonically increasing in $p$, with $F(0) = 0$. Since
$F(p_{\rm th}^{(\gamma)}) = 1$, we can guarantee that $p_{\rm th}^{(1)} \le p_{\rm th}^{(\gamma)}$ by showing
that $F(p_{\rm th}^{(1)}) \le 1$. Using property (b), we have that

$$F(p) \le \frac{1}{z_e} \sum_c (z - c + 1)\,\gamma(z, c) \left[\, p(z - c) + \frac{(z - c + 1)\,p^2 (c - 1)}{1 - p(c - 2)} \right] \qquad (B3)$$

for $p < \min_c (1/(c - 2))$.
Substituting $p = p_{\rm th}^{(1)} = 1/(z - 1)$ (note this $p$ obeys
$p < 1/(c - 2)$ for all relevant clique classes since $c < z + 1$
in a $z$-regular graph) simplifies the right-hand side to
yield

$$F(p_{\rm th}^{(1)}) \le \frac{1}{z_e} \sum_c (z - c + 1)\,\gamma(z, c) = 1, \qquad (B4)$$

hence implying that $p_{\rm th}^{(\gamma)} \ge p_{\rm th}^{(1)}$ as desired.
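
The argument above can be checked numerically for the binomial parametrization with $g_z = \sqrt{C}$. The following is a sketch under stated assumptions rather than the tabulated polynomials of [2]: the within-clique cluster-size probabilities $P(m|c)$ are computed here from the standard recursion for the probability that a complete graph remains connected under bond percolation, and the threshold is located by bisection on $F(p) = 1$. All function names are our own.

```python
from math import comb, sqrt


def connected_prob(m, p, cache):
    """Probability that the complete graph K_m stays connected at bond probability p."""
    if m not in cache:
        # Remove the cases where the cluster of a fixed node has only j < m members.
        cache[m] = 1.0 - sum(
            comb(m - 1, j - 1) * connected_prob(j, p, cache) * (1 - p) ** (j * (m - j))
            for j in range(1, m)
        )
    return cache[m]


def D(c, p):
    """D_c(p) = p * sum_{m=1}^{c} (m - 1) P(m|c)."""
    cache = {}
    total = 0.0
    for m in range(1, c + 1):
        P_m = comb(c - 1, m - 1) * connected_prob(m, p, cache) * (1 - p) ** (m * (c - m))
        total += (m - 1) * P_m
    return p * total


def F(p, z, C):
    """F(p) for a z-regular graph with binomial gamma(z, c) and g_z = sqrt(C)."""
    g = sqrt(C)
    num = den = 0.0
    for c in range(1, z + 2):
        w = (z - c + 1) * comb(z, c - 1) * g ** (c - 1) * (1 - g) ** (z - c + 1)
        num += w * (p * (z - c) + (z - c + 1) * D(c, p))
        den += w
    if den == 0.0:
        raise ValueError("C = 1 gives isolated cliques with no external links")
    return num / den


def threshold_regular(z, C, tol=1e-12):
    """Solve F(p) = 1 by bisection, using the monotonicity of F in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid, z, C) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for C in (0.0, 0.1, 0.3):
        print(C, threshold_regular(4, C), 1 / 3)
```

For $c = 3$ the recursion gives $D_3(p) = 2p^2(1 + p - p^2)$, which lies below the bound $2p^2/(1 - p)$ for $0 < p < 1$.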



Appendix C: Degree-degree correlations in $\gamma$-theory
networks

The ensemble of networks characterized by $\gamma(k, c)$ is
constructed as described in [2]. To determine the degree-correlation
matrix $P(k, k')$ we calculate the probability
that a randomly-chosen edge of the network joins together
nodes of degree $k$ and $k'$. The construction algorithm
for the $\gamma(k, c)$ network is based on specifying stubs
(half-edges) as either external stubs or $c$-clique stubs.
Since each $k$-degree node in a $c$-clique has $k - c + 1$ external
stubs and $c - 1$ $c$-clique stubs, the number of external
edges in the network (half the number of external stubs)
is given by

$$E_e = \frac{N}{2} \sum_{k,c} (k - c + 1)\,\gamma(k, c), \qquad (C1)$$

where $N$ is the number of nodes. Similarly, the total
number of $c$-clique edges is

$$E_c = \frac{N}{2} \sum_k (c - 1)\,\gamma(k, c), \quad \text{for } c > 1. \qquad (C2)$$

The sum over all $c$-clique classes, plus the external edges,
gives the total number $E$ of edges in the network:

$$E = E_e + \sum_{c>1} E_c = \frac{1}{2} N z. \qquad (C3)$$

Therefore a randomly-chosen edge of the network is an
external edge with probability $E_e/E = \alpha^{(1)}$ and is a
$c$-clique edge with probability $E_c/E = \alpha^{(c)}$. Then the
global $P(k, k')$ matrix may be written as

$$P(k, k') = \frac{E_e}{E} P_e(k, k') + \sum_{c>1} \frac{E_c}{E} P_c(k, k')
= \alpha^{(1)} P_e(k, k') + \sum_{c>1} \alpha^{(c)} P_c(k, k'), \qquad (C4)$$

where $P_e(k, k')$ is the probability that a randomly chosen
external edge joins nodes of degrees $k$ and $k'$, and
$P_c(k, k')$ is similarly defined for $c$-clique edges.

Suppose first that the chosen edge is an external
edge. Since external edges are composed of randomly-connected
external stubs, the probability that an end-vertex
is of degree $k$ is

$$s_k^{(1)} = \frac{\sum_c (k - c + 1)\,\gamma(k, c)}{\sum_{k'',c} (k'' - c + 1)\,\gamma(k'', c)}, \qquad (C5)$$

and the probability that the chosen external edge links
nodes of degrees $k$ and $k'$ is

$$P_e(k, k') = s_k^{(1)} s_{k'}^{(1)}. \qquad (C6)$$

If the chosen edge is a $c$-clique edge, the probability that
an end-vertex is of degree $k$ is

$$s_k^{(c)} = \frac{(c - 1)\,\gamma(k, c)}{\sum_{k''} (c - 1)\,\gamma(k'', c)} = \frac{\gamma(k, c)}{\sum_{k''} \gamma(k'', c)}, \qquad (C7)$$

and the probability that the chosen $c$-clique edge links
nodes of degree $k$ and $k'$ is

$$P_c(k, k') = s_k^{(c)} s_{k'}^{(c)} \quad \text{for } c > 1. \qquad (C8)$$

Inserting (C6) and (C8) into (C4) enables us to write the
global $P(k, k')$ matrix for the network as

$$P(k, k') = \alpha^{(1)} s_k^{(1)} s_{k'}^{(1)} + \sum_{c>1} \alpha^{(c)} s_k^{(c)} s_{k'}^{(c)}
= \sum_{c \ge 1} \alpha^{(c)} s_k^{(c)} s_{k'}^{(c)}. \qquad (C9)$$



We can then calculate $p_{\rm th}^{(2)}$, the bond percolation threshold
in an unclustered network with the same degree-degree
correlations as the original network [19], [25], as
$p_{\rm th}^{(2)} = 1/\lambda_{\max}$, where $\lambda_{\max}$ is the largest eigenvalue of
the matrix C with entries given by

$$C_{kk'} = \frac{(k' - 1)\,P(k,k')}{\sum_{k''} P(k,k'')}. \tag{C10}$$
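A minimal numerical sketch of this eigenvalue calculation follows. It assumes the conditional-probability form $C_{kk'} = (k'-1)P(k,k')/\sum_{k''}P(k,k'')$ for the matrix entries, and uses plain power iteration, which is our choice here rather than a method prescribed by the text. The function name `threshold_p2` is hypothetical. As a sanity check, an uncorrelated network with $P(k,k') = q_k q_{k'}$ should recover the familiar result $z/(\langle k^2\rangle - z)$.

```python
from fractions import Fraction


def threshold_p2(P, iters=500):
    """Estimate 1 / lambda_max for C[k, k'] = (k' - 1) P(k, k') / sum_k'' P(k, k'').

    P is a dict {(k, k'): probability}; every degree must have a nonzero row sum.
    Power iteration is adequate here because C is small and non-negative.
    """
    degrees = sorted({k for k, _ in P})
    row = {k: sum(P[k, kp] for kp in degrees) for k in degrees}
    C = {(k, kp): float((kp - 1) * P[k, kp] / row[k])
         for k in degrees for kp in degrees}
    v = {k: 1.0 for k in degrees}
    lam = 0.0
    for _ in range(iters):
        w = {k: sum(C[k, kp] * v[kp] for kp in degrees) for k in degrees}
        lam = max(w.values())               # converges to lambda_max
        v = {k: x / lam for k, x in w.items()}
    return 1.0 / lam


# Uncorrelated check: half the nodes of degree 2, half of degree 3,
# so q_2 = 2/5 and q_3 = 3/5 and the threshold should be 5/8.
q = {2: Fraction(2, 5), 3: Fraction(3, 5)}
P_uncorr = {(k, kp): q[k] * q[kp] for k in q for kp in q}
print(round(threshold_p2(P_uncorr), 6))  # 0.625
```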



Moreover, we can see that γ-theory networks are necessarily
assortative by showing that

$$\sum_{k,k'} k\,P(k,k')\,k' - \Big[\sum_{k,k'} k\,P(k,k')\Big]^2 \ge 0. \tag{C11}$$

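This quantity determines the sign of the Pearson correlation
coefficient r defined in Eq. (3) of [25], with positive
values corresponding to assortative networks. Using
(C9), the left-hand side of (C11) may be written as

$$\sum_{c} \alpha^{(c)} x_c^2 - \Big(\sum_{c} \alpha^{(c)} x_c\Big)^2, \tag{C12}$$

where $x_c = \sum_k k\,s_k^{(c)}$ and $\sum_c \alpha^{(c)} = 1$, so this expression
may be rewritten as

$$\frac{1}{2} \sum_{c,c'} \alpha^{(c)} \alpha^{(c')} \left(x_c - x_{c'}\right)^2. \tag{C13}$$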

Since all terms are non-negative the inequality (C11)
must hold, and the γ-theory networks are assortative.

We emphasize that assortativity follows here directly
from the decomposition (C9) of P(k, k') into disjoint
parts, each of which has the form of a randomly-connected
network. In Newman's recent clustering model,
for example, there are also two types of links: those
which are edges of triangles, and those which are not.
Stubs of each of these two types are randomly connected
to stubs of the same type. It follows that the P(k, k')
matrix for Newman's theory must be of the general form
(C9), and therefore networks generated by his model
must also be assortative.



Appendix D: Example of correlation in colored-edge 
networks 



We consider a particular example of the non-trivial correlation
structure of the colored-edge networks described
in Section IV (and further analyzed in Appendix E).
Consider a colored-edge network corresponding to the
example (7), where half the nodes are of degree k = 2 and
half are of degree k = 3. We choose parameters α = 0
and β = 1, which means that every k = 2 node has two
green stubs, and every k = 3 node has 1 green and 2 
red stubs. Pairs of green stubs are chosen at random to 
form green edges, and similarly for red stubs/edges. The 
nearest-neighbor correlations are given by the P(k, k') 
matrix defined in (C9); for the parameters chosen here
we have P(2,2) = 4/15, P(2,3) = P(3,2) = 2/15, and 
P(3,3) = 7/15. 

Let us now consider degree correlations beyond 
nearest-neighbors. Specifically, we choose a node of de- 
gree 3 and examine the fraction of its second neighbors 
which are also of degree 3 (ignoring cycles in the N → ∞
limit). We denote this quantity Q(3|3), as it is the prob- 
ability that node A has a second neighbor of degree 3, 
given that node A itself has degree 3. 

Since the degree distribution of first-neighbors of A is
given exactly by

$$P(k|3) = \frac{P(k,3)}{\sum_{k'} P(k',3)} \quad \text{for } k = 2, 3, \tag{D1}$$



it is tempting to calculate second-neighbor correlations
under the Markovian assumption that the network is
completely described by its P(k, k') distribution. This
assumption underlies the calculation of the threshold we
denote as $p_{\rm th}^{(2)}$, and if applied to our example would
estimate the value of Q(3|3) by

$$\sum_{k'} P(3|k')\,P(k'|3) = \frac{55}{81}. \tag{D2}$$
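The arithmetic behind the Markovian estimate can be checked with a few lines of exact rational arithmetic. This is only an illustrative sketch: the helper name `cond` is ours, and the P(k, k') values are those quoted above for this example.

```python
from fractions import Fraction as F

# Nearest-neighbor matrix P(k, k') quoted in the text for this example.
P = {(2, 2): F(4, 15), (2, 3): F(2, 15), (3, 2): F(2, 15), (3, 3): F(7, 15)}
degrees = (2, 3)


def cond(k, given):
    """P(k | given): degree distribution of a neighbor of a degree-`given` node."""
    return P[k, given] / sum(P[kk, given] for kk in degrees)


# Markovian estimate of Q(3|3): chain two nearest-neighbor steps.
markov_q33 = sum(cond(3, kp) * cond(kp, 3) for kp in degrees)
print(markov_q33)  # 55/81
```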



However, the coloring of the edges implies that the 
true nth-nearest-neighbor correlation structure is not ad- 
equately described by P(k,k') for n > 1. To show this, 
we now calculate the exact value of Q(3|3) and show that 
it differs from the Markovian-assumption estimate (D2).
First, note that since all k = 3 nodes have 1 green stub 
(as well as 2 red stubs) and all k = 2 nodes have 2 green 
stubs, travelling along a random green edge will lead to 
a k = 3 node with probability 1/3, and to a k = 2 node 
with probability 2/3. Similarly, travelling along a ran- 
dom red edge leads to a k = 3 node with probability 
1. 

Let us start at the k = 3 node called A, and enumerate 
all possible paths leading to degree-3 second neighbors of 
A, thus calculating Q(3|3). A fraction 1/3 of A's first 
neighbors are accessed via green edges, with the remain- 
ing fraction 2/3 being accessed by travelling along a red 
edge. Suppose first that we travel along a green edge 
from A. With probability 1/3 the green edge leads to a 
k = 3 neighbor, otherwise the neighbor has k = 2. If
the neighbor has k = 3, and noting that we arrived at
it along a green edge, its connections to second neighbors
of A are necessarily along red edges, and so these
second neighbors have degree k = 3 with probability 1.
On the other hand, if the first neighbor of A has k = 2, 
the access to A's second neighbor along this path must 
be along a green edge, and so the second neighbor found 
on this path is of degree 3 with probability 1/3. 

To summarize so far: starting from a k = 3 node A we
can find degree-3 second neighbors of A by proceeding

• along a green edge (prob 1/3) via a k = 3 first
neighbor (prob 1/3) and then along a red edge
(prob 1). Total probability: 1/9.

• or, along a green edge (prob 1/3) via a k = 2 first 
neighbor (prob 2/3) and then along a green edge 
(prob 1/3). Total probability: 2/27. 

Similar arguments show that the remaining possible 
paths proceed from A 

• along a red edge (prob 2/3) via a first neighbor of 
degree-3 (prob 1) and then either along a red edge 
(prob 1/2) to a k = 3 node (prob 1), or along a 
green edge (prob 1/2) to a k = 3 node (prob 1/3). 
Total probability: 4/9. 



Summing over all possible paths we obtain

$$Q(3|3) = \frac{1}{9} + \frac{2}{27} + \frac{4}{9} = \frac{17}{27}, \tag{D3}$$
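The path enumeration can also be mechanized. The sketch below follows the same steps as the text: pick an edge color at the starting node, pick the first neighbor reached along that color, then pick one of that neighbor's remaining stubs. All names (`stubs`, `arrive`, `q_exact`) are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction as F

# Stub colors per degree in the example: k=2 has 2 green; k=3 has 1 green, 2 red.
stubs = {2: {"green": 2, "red": 0}, 3: {"green": 1, "red": 2}}
frac = {2: F(1, 2), 3: F(1, 2)}  # fraction of nodes of each degree


def arrive(color):
    """Degree distribution of the node reached along a random edge of `color`."""
    weight = {k: frac[k] * stubs[k][color] for k in stubs}
    total = sum(weight.values())
    return {k: w / total for k, w in weight.items() if w}


def q_exact(k_start, k_target):
    """Probability that a two-step path from a degree-k_start node ends at k_target."""
    total = F(0)
    for c1, n1 in stubs[k_start].items():
        if not n1:
            continue
        p_c1 = F(n1, k_start)                 # color of the first edge
        for k1, p_k1 in arrive(c1).items():   # degree of the first neighbor
            remaining = dict(stubs[k1])
            remaining[c1] -= 1                # the stub we arrived on is used up
            out = sum(remaining.values())
            for c2, n2 in remaining.items():
                if n2:
                    total += p_c1 * p_k1 * F(n2, out) * arrive(c2).get(k_target, 0)
    return total


print(q_exact(3, 3))  # 17/27
```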



which differs from the value 55/81 obtained in (D2) under
the Markovian approximation. We conclude that
in colored-edge networks (and hence in the γ-theory
clustered networks) nth-nearest-neighbor correlations beyond
n = 1 are not completely described by the P(k, k')
distribution under the Markovian assumption.



Appendix E: Percolation in colored-edge networks 

We consider bond percolation in an unclustered network
of N nodes (in the N → ∞ limit), composed of two
types of edges (green or red) as described in Section IV.
Such networks may be created by considering a γ-theory
network with only one non-trivial clique class $c = \hat{c}$ and
with the internal $\hat{c}$-clique edges colored red while the
external links are colored green (see the accompanying figure for an example
with $\hat{c} = 3$). A similar idea is used in [21] for Newman's
triangle-based networks [1]. The total number of green
stubs (half-edges) is

$$N \sum_{k,c} (k - c + 1)\,\gamma(k,c) = N \sum_{k} k\,\gamma(k,1) + N \sum_{k} (k - \hat{c} + 1)\,\gamma(k,\hat{c}), \tag{E1}$$



and the total number of red stubs is

$$N \sum_{k} (\hat{c} - 1)\,\gamma(k,\hat{c}), \tag{E2}$$



since any node with red stubs has exactly $\hat{c} - 1$ of them.
Green stubs are randomly linked to green stubs, and sim- 
ilarly for red stubs. As in 0, [HJ, we define a node as 
active if it is part of the GCC, and assume all nodes are 
initially inactive. Using a tree structure, define $q_g$ as the
probability that a node with a green edge linking to its
parent is active, and $q_r$ as the corresponding probability
for a node with a red edge leading to its parent. Then
standard arguments (see, for example, [1(3, [IB]) lead to 
the following self-consistent equations for $q_g$ and $q_r$:



$$q_g = G(q_g, q_r), \qquad q_r = R(q_g, q_r), \tag{E3}$$



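where the functions G and R are defined as

$$G(q_g, q_r) = \sum_{k,c} \frac{(k - c + 1)\,\gamma(k,c)}{\sum_{k',c'} (k' - c' + 1)\,\gamma(k',c')} \left[1 - (1 - p q_g)^{k-c} (1 - p q_r)^{c-1}\right], \tag{E4}$$

$$R(q_g, q_r) = \sum_{k} \frac{\gamma(k,\hat{c})}{\sum_{k'} \gamma(k',\hat{c})} \left[1 - (1 - p q_g)^{k-\hat{c}+1} (1 - p q_r)^{\hat{c}-2}\right]. \tag{E5}$$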



Similarly, the final density of active nodes, i.e., the GCC
size, is given by

$$S = \sum_{k,c} \gamma(k,c) \left[1 - (1 - p q_g)^{k-c+1} (1 - p q_r)^{c-1}\right]. \tag{E6}$$
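The self-consistent system can be solved by simple fixed-point iteration. The following is a minimal sketch under stated assumptions: G, R and S are taken in the forms $G = \sum_{k,c} w_{k,c}[1-(1-pq_g)^{k-c}(1-pq_r)^{c-1}]$ with $w_{k,c}$ the normalized green-stub weights, $R$ the analogous average over $\hat{c}$-clique members with exponents $k-\hat{c}+1$ and $\hat{c}-2$, and $S$ the unweighted average with exponents $k-c+1$ and $c-1$. The function name `gcc_size`, the argument `c_hat` and the sample γ table are hypothetical.

```python
def gcc_size(gamma, c_hat, p, iters=2000):
    """Iterate q_g = G(q_g, q_r), q_r = R(q_g, q_r) and return the GCC size S.

    gamma maps (k, c) to the fraction of nodes of degree k in a c-clique,
    with c = 1 meaning "no clique" and c = c_hat the single clique class.
    """
    z_green = sum((k - c + 1) * g for (k, c), g in gamma.items())
    n_clique = sum(g for (k, c), g in gamma.items() if c == c_hat)
    qg = qr = 1.0  # start from the fully active state
    for _ in range(iters):
        # Node reached along a green edge: k - c green and c - 1 red children.
        new_qg = sum((k - c + 1) * g / z_green
                     * (1 - (1 - p * qg) ** (k - c) * (1 - p * qr) ** (c - 1))
                     for (k, c), g in gamma.items() if k - c + 1 > 0)
        # Node reached along a red edge: k - c_hat + 1 green, c_hat - 2 red children.
        new_qr = sum(g / n_clique
                     * (1 - (1 - p * qg) ** (k - c_hat + 1)
                        * (1 - p * qr) ** (c_hat - 2))
                     for (k, c), g in gamma.items() if c == c_hat)
        qg, qr = new_qg, new_qr
    # A randomly chosen node has k - c + 1 green and c - 1 red neighbors.
    return sum(g * (1 - (1 - p * qg) ** (k - c + 1) * (1 - p * qr) ** (c - 1))
               for (k, c), g in gamma.items())


# Hypothetical table: half the nodes k = 2 outside cliques, half k = 3 in triangles.
gamma = {(2, 1): 0.5, (3, 3): 0.5}
for p in (0.5, 0.7, 0.9):
    print(p, gcc_size(gamma, 3, p))
```

Starting the iteration from $q_g = q_r = 1$ makes it converge to the largest fixed point, so S stays at zero below the threshold and becomes positive above it.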



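The percolation threshold point is determined by standard
cascade condition arguments [20] applied to the system
(E3)-(E5). Defining B as the matrix

$$B = \frac{1}{p} \begin{pmatrix} \dfrac{\partial G}{\partial q_g} & \dfrac{\partial G}{\partial q_r} \\[2ex] \dfrac{\partial R}{\partial q_g} & \dfrac{\partial R}{\partial q_r} \end{pmatrix} \Bigg|_{q_g = q_r = 0}, \tag{E7}$$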



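which has elements

$$B_{11} = \frac{\sum_{k,c} (k - c + 1)(k - c)\,\gamma(k,c)}{\sum_{k,c} (k - c + 1)\,\gamma(k,c)}, \qquad B_{12} = \frac{(\hat{c} - 1) \sum_{k} (k - \hat{c} + 1)\,\gamma(k,\hat{c})}{\sum_{k,c} (k - c + 1)\,\gamma(k,c)},$$

$$B_{21} = \frac{\sum_{k} (k - \hat{c} + 1)\,\gamma(k,\hat{c})}{\sum_{k} \gamma(k,\hat{c})}, \qquad B_{22} = \hat{c} - 2, \tag{E8}$$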



the percolation threshold is given by $p_{\rm th}^{(ce)} = 1/\lambda_{\max}$,
where $\lambda_{\max}$ is the larger of the eigenvalues of B, i.e.,

$$p_{\rm th}^{(ce)} = \frac{2}{B_{11} + B_{22} + \sqrt{(B_{11} - B_{22})^2 + 4 B_{12} B_{21}}}. \tag{E9}$$
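A short numerical sketch of the threshold formula follows. It assumes the entries of B are the derivatives of G and R at $q_g = q_r = 0$ with the common factor p removed, so that $B_{22} = \hat{c} - 2$. The function name `colored_edge_threshold` and the sample γ table are hypothetical and used only for illustration.

```python
import math


def colored_edge_threshold(gamma, c_hat):
    """Return ((B11, B12, B21, B22), p_th) for a colored-edge network.

    gamma maps (k, c) to the fraction of nodes of degree k in a c-clique,
    with c = 1 for nodes outside cliques and c = c_hat for clique members.
    """
    z_green = sum((k - c + 1) * g for (k, c), g in gamma.items())
    clique = {k: g for (k, c), g in gamma.items() if c == c_hat}
    green_on_clique = sum((k - c_hat + 1) * g for k, g in clique.items())

    b11 = sum((k - c + 1) * (k - c) * g for (k, c), g in gamma.items()) / z_green
    b12 = (c_hat - 1) * green_on_clique / z_green
    b21 = green_on_clique / sum(clique.values())
    b22 = c_hat - 2

    # Larger eigenvalue of the 2x2 matrix B, then p_th = 1 / lambda_max.
    lam = (b11 + b22 + math.sqrt((b11 - b22) ** 2 + 4 * b12 * b21)) / 2
    return (b11, b12, b21, b22), 1 / lam


# Hypothetical table: half the nodes k = 2 outside cliques, half k = 3 in triangles.
gamma = {(2, 1): 0.5, (3, 3): 0.5}
B, p_th = colored_edge_threshold(gamma, 3)
print(B, p_th)
```

For this table the entries work out to $B_{11} = B_{12} = 2/3$ and $B_{21} = B_{22} = 1$, giving $\lambda_{\max} = 5/3$ and a threshold of 3/5.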
Since all the $B_{ij}$ elements are non-negative, we have the
bound

$$p_{\rm th}^{(ce)} < \frac{1}{B_{22}} \tag{E10}$$

(this follows by noting $\sqrt{(B_{11} - B_{22})^2 + 4 B_{12} B_{21}} > B_{22} - B_{11}$), which we will use below.

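Next we show that $p_{\rm th}^{(ce)} < p_{\rm th}^{(\gamma)}$ for networks of this
type, where $p_{\rm th}^{(\gamma)}$ denotes the threshold of the clustered γ-theory network. From Eq. (3), note that $p_{\rm th}^{(\gamma)}$ is the solution of the
polynomial equation H(p) = 1, where

$$H(p) = \sum_{k,c} \frac{(k - c + 1)\,\gamma(k,c)}{\sum_{k',c'} (k' - c' + 1)\,\gamma(k',c')} \Big( p(k - c) + (\kappa_c - c + 1)\,D_c(p) \Big) = B_{11}\,p + \frac{1}{\hat{c} - 1} B_{12} B_{21} D_{\hat{c}}(p), \tag{E11}$$

and $B_{ij}$ refers to the entries of the non-negative matrix
B above. Following the arguments of Appendix B we
will show that $H\big(p_{\rm th}^{(ce)}\big) < 1$ by using the bound (B2) on
$D_{\hat{c}}(p)$. This gives

$$H(p) < B_{11}\,p + B_{12} B_{21} \frac{p^2}{1 - p(\hat{c} - 2)} \tag{E12}$$

for all p such that $0 < p < 1/(\hat{c} - 2)$.

Noting that $B_{22} = \hat{c} - 2$, we see from (E10) that the inequality
$p_{\rm th}^{(ce)} < 1/(\hat{c} - 2)$ is obeyed and so we may apply
(E12) with $p = p_{\rm th}^{(ce)}$. Substituting $p = p_{\rm th}^{(ce)}$ from (E9)
(with (E8)) into (E12) and simplifying yields

$$H\big(p_{\rm th}^{(ce)}\big) < 1, \tag{E13}$$

and the result $p_{\rm th}^{(ce)} < p_{\rm th}^{(\gamma)}$ follows.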
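The simplification leading to (E13) can be sketched as follows, under the assumption that the right-hand side of (E12) is $B_{11}p + B_{12}B_{21}\,p^2/(1 - pB_{22})$ with $B_{22} = \hat{c} - 2$, and that $B_{12}B_{21} > 0$ so that $\lambda_{\max} \neq B_{22}$. Writing $\lambda \equiv \lambda_{\max} = 1/p_{\rm th}^{(ce)}$, the characteristic equation of B supplies the key identity:

```latex
% lambda is an eigenvalue of the 2x2 matrix B:
(\lambda - B_{11})(\lambda - B_{22}) = B_{12} B_{21}.

% Evaluate the right-hand side of (E12) at p = 1/\lambda:
\frac{B_{11}}{\lambda} + \frac{B_{12} B_{21}}{\lambda^{2}\,(1 - B_{22}/\lambda)}
  = \frac{B_{11}}{\lambda} + \frac{B_{12} B_{21}}{\lambda\,(\lambda - B_{22})}
  = \frac{B_{11}}{\lambda} + \frac{\lambda - B_{11}}{\lambda}
  = 1.
```

The upper bound on H therefore equals exactly 1 at $p = p_{\rm th}^{(ce)}$, and the strict inequality in (E12) gives $H\big(p_{\rm th}^{(ce)}\big) < 1$.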